fix(site): Skip file-tree walk when refreshing only database usage#6765
fix(site): Skip file-tree walk when refreshing only database usage#6765balamurali27 wants to merge 3 commits into
Conversation
A database-usage refresh went through the generic sync_info, which calls the agent's get_usage — and that recursively walks the site's entire file tree (public, private, backups) just to sum file sizes. On sites with large file trees the walk exceeds the gunicorn timeout, killing agent web workers. The dashboard polls refresh_database_usage every few seconds, so a single bloated site keeps the walk running back-to-back. Threading a database_only flag through get_site_info -> fetch_info -> sync_info fixes this: in that mode the agent is asked (via ?database_only=1) for just the cheap database size, and _sync_database_usage records a Site Usage row that carries the last-known file totals forward instead of recomputing them. Requires the matching agent change to honour ?database_only; without it the agent ignores the param and falls back to the full walk (current behaviour), so the two can be deployed in either order safely. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
| self._insert_site_usage( | ||
| { | ||
| "database": fetched_usage["database"], |
There was a problem hiding this comment.
_sync_database_usage accesses fetched_usage["database"] with a hard key lookup. If the agent returns a response that passes the if not data guard but has a missing or renamed "database" key (e.g., during a partial rollout of the matching agent change), this raises an unhandled KeyError in the refresh_database_usage code path, where there is no suppress(Exception) wrapper — surfacing a 500 to the polling dashboard every 3 s.
| self._insert_site_usage( | |
| { | |
| "database": fetched_usage["database"], | |
| database_size = fetched_usage.get("database") | |
| if database_size is None: | |
| return | |
| self._insert_site_usage( | |
| { | |
| "database": database_size, |
Prompt To Fix With AI
This is a comment left during a code review.
Path: press/press/doctype/site/site.py
Line: 2114-2116
Comment:
`_sync_database_usage` accesses `fetched_usage["database"]` with a hard key lookup. If the agent returns a response that passes the `if not data` guard but has a missing or renamed `"database"` key (e.g., during a partial rollout of the matching agent change), this raises an unhandled `KeyError` in the `refresh_database_usage` code path, where there is no `suppress(Exception)` wrapper — surfacing a 500 to the polling dashboard every 3 s.
```suggestion
database_size = fetched_usage.get("database")
if database_size is None:
return
self._insert_site_usage(
{
"database": database_size,
```
How can I resolve this? If you propose a fix, please make it concise.|
|
||
| @frappe.whitelist() | ||
| def sync_info(self, data=None): | ||
| def sync_info(self, data=None, database_only=False): |
There was a problem hiding this comment.
database_only is a raw bool default on a @frappe.whitelist() method. When invoked via the HTTP API with a query-string value such as database_only=False, Frappe passes it as the string "False", which is truthy in Python. This would silently take the fast _sync_database_usage path (skipping config/timezone sync) even when the caller intended the full sync. Consider coercing with frappe.utils.sbool(database_only) or frappe.utils.cint(database_only) at the top of the method.
Prompt To Fix With AI
This is a comment left during a code review.
Path: press/press/doctype/site/site.py
Line: 2202
Comment:
`database_only` is a raw `bool` default on a `@frappe.whitelist()` method. When invoked via the HTTP API with a query-string value such as `database_only=False`, Frappe passes it as the string `"False"`, which is truthy in Python. This would silently take the fast `_sync_database_usage` path (skipping config/timezone sync) even when the caller intended the full sync. Consider coercing with `frappe.utils.sbool(database_only)` or `frappe.utils.cint(database_only)` at the top of the method.
How can I resolve this? If you propose a fix, please make it concise.Over the HTTP API a query-string value like database_only=False arrives as the string "False", which is truthy and would wrongly take the fast _sync_database_usage path. Annotating the parameter as bool lets Frappe's whitelist argument coercion convert it via sbool. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
|
recent f1-ksa stuck jobs incident |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #6765 +/- ##
============================================
- Coverage 62.94% 50.70% -12.24%
============================================
Files 117 994 +877
Lines 18110 83816 +65706
Branches 527 527
============================================
+ Hits 11399 42498 +31099
- Misses 6678 41285 +34607
Partials 33 33
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Problem
refresh_database_usage(polled by the dashboard's Site Overview / Database Analyzer pages every few seconds) goes through the genericsync_info, which calls the agent'sget_usage. That recursively walks the site's entire file tree (public, private, backups) just to sum file sizes — even though the caller only wants the database size.On sites with large file trees the walk exceeds the gunicorn timeout and kills agent web workers (
WORKER TIMEOUT). Because the dashboard re-polls every ~3s, a single bloated site keeps the walk running back-to-back.Fix
Thread a
database_onlyflag throughget_site_info→fetch_info→sync_info:?database_only=1) for just the cheap database size._sync_database_usagerecords aSite Usagerow that carries the last-known file totals forward instead of recomputing them, so disk-usage reporting is unaffected.refresh_database_usageparser-off branch and theprocess_refresh_database_usage_job_updatecallback) passdatabase_only=True.Deploy note
Paired PR: frappe/agent#548
Requires the matching agent change to honour
?database_only(frappe/agent#548). Without it the agent ignores the param and falls back to the full walk (current behaviour), so the two can be deployed in either order safely.Test
Added
test_sync_info_database_only_refreshes_db_size_and_carries_files_forward— asserts a database-only sync refreshes the DB size while carryingpublic/private/backupsforward, and thatfetch_infois called withdatabase_only=True.🤖 Generated with Claude Code